61 research outputs found

    Active Inverse Reward Design

    Designers of AI agents often iterate on the reward function in a trial-and-error process until they get the desired behavior, but this only guarantees good behavior in the training environment. We propose structuring this process as a series of queries asking the user to compare different reward functions. This lets us actively select queries for maximum informativeness about the true reward. In contrast to approaches that ask the designer for optimal behavior, it allows us to gather additional information by eliciting preferences between suboptimal behaviors. After each query, we update the posterior over the true reward function from observing the proxy reward function chosen by the designer, which the recently proposed Inverse Reward Design (IRD) makes possible. Our approach substantially outperforms IRD in test environments. In particular, it can query the designer about interpretable, linear reward functions and still infer non-linear ones.
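    A minimal sketch of the active query-selection idea described above, assuming a discrete hypothesis space of true reward weights and an IRD-style choice model in which the designer soft-maximally picks the proxy whose induced behavior scores best under the true reward. The feature-count stand-in and all names are illustrative, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hyp, n_feat = 50, 3
true_hypotheses = rng.normal(size=(n_hyp, n_feat))   # candidate "true" reward weights
posterior = np.full(n_hyp, 1.0 / n_hyp)              # uniform prior over hypotheses

def feature_counts(proxy_w):
    # Toy stand-in for the feature counts of behaviour optimised for proxy_w in the
    # training environment: the agent collects more of what it is paid for.
    return proxy_w / (np.linalg.norm(proxy_w) + 1e-8)

def answer_likelihoods(query, beta=5.0):
    """P(designer picks proxy i | true reward w) for each hypothesis w.
    query: (k, n_feat) proxy weights; returns an (n_hyp, k) probability matrix."""
    phis = np.stack([feature_counts(q) for q in query])   # (k, n_feat)
    logits = beta * true_hypotheses @ phis.T               # true return of each proxy's behaviour
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum()

def expected_info_gain(query, posterior):
    lik = answer_likelihoods(query)          # (n_hyp, k)
    p_answer = posterior @ lik               # marginal probability of each answer
    gain = entropy(posterior)
    for i, pa in enumerate(p_answer):
        post_i = posterior * lik[:, i]
        post_i /= post_i.sum()
        gain -= pa * entropy(post_i)
    return gain

# Actively pick the most informative query from a pool of candidate proxy-reward pairs.
candidate_queries = [rng.normal(size=(2, n_feat)) for _ in range(20)]
best_query = max(candidate_queries, key=lambda q: expected_info_gain(q, posterior))
```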

    Reducing Exploitability with Population Based Training

    Self-play reinforcement learning has achieved state-of-the-art, and often superhuman, performance in a variety of zero-sum games. Yet prior work has found that policies that are highly capable against regular opponents can fail catastrophically against adversarial policies: opponents trained explicitly against the victim. Prior defenses using adversarial training were able to make the victim robust to a specific adversary, but the victim remained vulnerable to new ones. We conjecture this limitation was due to insufficient diversity of adversaries seen during training. We propose a defense using population based training to pit the victim against a diverse set of opponents. We evaluate this defense's robustness against new adversaries in two low-dimensional environments. Our defense increases robustness against adversaries, as measured by the number of attacker training timesteps needed to exploit the victim. Furthermore, we show that robustness is correlated with the size of the opponent population. Comment: Presented at the New Frontiers in Adversarial Machine Learning Workshop, ICML 2022.
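    The defense described above can be sketched as a simple alternating loop: each generation, a fresh adversary is trained to exploit the current victim and added to a growing population, and the victim is then trained against opponents sampled from the whole population. The sketch below only shows this loop structure; make_policy and train_against are placeholders (assumptions), not the paper's actual RL training code.

```python
import random

def make_policy(name):
    # Placeholder policy object; in practice this would be an RL policy network.
    return {"name": name, "updates": 0}

def train_against(learner, opponents, steps):
    # Placeholder for RL updates of `learner` in a two-player environment, with the
    # opponent resampled from `opponents` at the start of each episode.
    for _ in range(steps):
        opponent = random.choice(opponents)
        learner["updates"] += 1            # stand-in for a gradient step against `opponent`

victim = make_policy("victim")
population = [make_policy("adversary_0")]

for generation in range(5):
    # Attacker phase: train a fresh adversary purely to exploit the current victim.
    adversary = make_policy(f"adversary_{generation + 1}")
    train_against(adversary, [victim], steps=1000)
    population.append(adversary)

    # Defender phase: train the victim against the whole, diverse population rather
    # than only the newest attacker, to avoid overfitting the defense to one opponent.
    train_against(victim, population, steps=1000)
```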

    imitation: Clean Imitation Learning Implementations

    imitation provides open-source implementations of imitation and reward learning algorithms in PyTorch. We include three inverse reinforcement learning (IRL) algorithms, three imitation learning algorithms, and a preference comparison algorithm. The implementations have been benchmarked against previous results, and automated tests cover 98% of the code. Moreover, the algorithms are implemented in a modular fashion, making it simple to develop novel algorithms in the framework. Our source code, including documentation and examples, is available at https://github.com/HumanCompatibleAI/imitation.
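    For context, the sketch below is a bare-bones behavioural-cloning loop in PyTorch, the kind of algorithm the package implements in a more general and tested form. It is not the imitation library's API; the data are random stand-in "expert" transitions with assumed shapes.

```python
import torch
import torch.nn as nn

obs_dim, n_actions = 8, 4
expert_obs = torch.randn(1024, obs_dim)               # stand-in expert observations
expert_acts = torch.randint(0, n_actions, (1024,))    # stand-in expert actions

# Small policy network mapping observations to action logits.
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    logits = policy(expert_obs)
    loss = loss_fn(logits, expert_acts)    # maximise likelihood of the expert's actions
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```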

    Adversarial Policies Beat Superhuman Go AIs

    We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies against it, achieving a >97% win rate against KataGo running at superhuman settings. Our adversaries do not win by playing Go well. Instead, they trick KataGo into making serious blunders. Our attack transfers zero-shot to other superhuman Go-playing AIs, and is comprehensible to the extent that human experts can implement it without algorithmic assistance to consistently beat superhuman AIs. The core vulnerability uncovered by our attack persists even in KataGo agents adversarially trained to defend against our attack. Our results demonstrate that even superhuman AI systems may harbor surprising failure modes. Example games are available at https://goattack.far.ai/. Comment: Accepted to ICML 2023; see paper for changelog.
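    The essence of such an attack is that the victim is frozen and treated as part of the environment, so the attacker only has to find moves that exploit that specific policy's blind spots rather than play well in general. The toy below illustrates this with a trivial game (Nim) and a crude random search in place of reinforcement learning; nothing here is KataGo or the authors' training setup.

```python
import random

def victim_policy(stones):
    # A frozen, mostly-sensible heuristic with an exploitable blind spot (never takes 3).
    return min(2 if stones % 4 == 3 else 1, stones)

def play_episode(attacker_policy, start_stones):
    stones = start_stones
    while True:
        stones -= min(attacker_policy(stones), stones)   # attacker moves first
        if stones == 0:
            return 1.0                                   # attacker took the last stone: win
        stones -= victim_policy(stones)                  # frozen victim replies
        if stones == 0:
            return 0.0                                   # victim wins

# Crude stand-in for adversarial-policy training: search lookup-table policies
# for one that exploits the frozen victim.
best_rate = -1.0
for _ in range(2000):
    table = {s: random.randint(1, 3) for s in range(1, 21)}
    policy = lambda s, t=table: t[s]
    rate = sum(play_episode(policy, random.randint(5, 20)) for _ in range(50)) / 50
    best_rate = max(best_rate, rate)

print(f"best attacker win rate found against the frozen victim: {best_rate:.2f}")
```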

    Surgical Treatment of Renal Cell Cancer Liver Metastases: A Population-Based Study

    Background: To evaluate outcomes of surgical treatment in patients with hepatic metastases from renal-cell carcinoma in the Netherlands, and to identify prognostic factors for survival after resection. Renal-cell carcinoma has an incidence of 2,000 new patients per year in the Netherlands (12.5/100,000 inhabitants). According to the literature, half of these patients ultimately develop distant metastases, with the liver involved in 20%. Resection of renal-cell carcinoma liver metastases (RCCLM) is performed in only a minority of patients. Hence, little is known about the outcome of resectable RCCLM. Methods: Patients were retrieved from the local databases of the Netherlands Task Force for Liver Surgery (14 centers) and from the Dutch collective pathology database. Survival and prognostic factors were determined by Kaplan-Meier analysis and the log-rank test. Results: Thirty-three patients were identified who underwent resection (n = 29) or local ablation (n = 4) of RCCLM in the Netherlands between 1990 and 2008. These patients comprise 0.5% to 1% of the total population of patients diagnosed with RCCLM in that period. There was no operative mortality. Overall survival at 1, 3, and 5 years was 79%, 47%, and 43%, respectively. Metachronous metastases (n = 23, P = 0.03) and radical resection (n = 19, P < 0.001) were statistically significant prognosticators of overall survival.
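    Illustrative only: the kind of Kaplan-Meier estimate and log-rank comparison used in this study, run on made-up numbers since the patient-level data are not included here. It assumes the Python lifelines package, and the grouping variable is a stand-in for a prognostic factor such as metachronous versus synchronous metastases.

```python
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

# Toy cohort: survival time in months, event indicator (1 = death observed),
# and a binary prognostic factor splitting patients into two groups.
months   = [12, 36, 60, 8, 24, 48, 30, 15, 70, 5]
observed = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
group    = [1, 1, 1, 0, 1, 1, 0, 0, 1, 0]

months_a = [m for m, g in zip(months, group) if g]
months_b = [m for m, g in zip(months, group) if not g]
events_a = [e for e, g in zip(observed, group) if g]
events_b = [e for e, g in zip(observed, group) if not g]

# Kaplan-Meier survival curve for one group.
kmf = KaplanMeierFitter()
kmf.fit(months_a, event_observed=events_a)
print(kmf.survival_function_)

# Log-rank test between the two prognostic groups.
result = logrank_test(months_a, months_b,
                      event_observed_A=events_a, event_observed_B=events_b)
print(result.p_value)
```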